We consider an online estimation problem involving a set of agents. Each agent has access to a (personal) process that generates samples from a real-valued distribution and seeks to estimate its mean. We study the case where some of the distributions have the same mean, and where the agents are allowed to actively query information from other agents. The goal is to design an algorithm that enables each agent to improve its mean estimate by communicating with the others. The means and the number of distributions are unknown, which makes the task non-trivial. We introduce a novel collaborative strategy to solve this online personalized mean estimation problem. We analyze its time complexity and introduce variants that enjoy good performance in numerical experiments. We also extend our approach to the setting where clusters of agents with similar means seek to estimate the mean of their cluster.
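A minimal sketch of the collaborative idea follows; the agents, sample sizes, and the Hoeffding-style pooling threshold are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: six agents, agents 0-2 share mean 0.0, agents 3-5 share mean 1.0.
true_means = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

def sample(agent, n):
    """Draw n observations from the agent's personal distribution (unit-variance Gaussian here)."""
    return rng.normal(true_means[agent], 1.0, size=n)

def personalized_estimate(agent, n_own=200, n_query=200, radius=None):
    """Estimate the agent's mean, pooling queried samples from agents whose
    empirical mean lies within a confidence radius (illustrative pooling rule)."""
    own = sample(agent, n_own)
    if radius is None:
        radius = 4.0 / np.sqrt(min(n_own, n_query))  # crude Hoeffding-style threshold
    pooled = [own]
    for other in range(len(true_means)):
        if other == agent:
            continue
        queried = sample(other, n_query)
        if abs(queried.mean() - own.mean()) <= radius:
            pooled.append(queried)  # treat as a same-mean agent and reuse its samples
    return np.concatenate(pooled).mean()

print(personalized_estimate(0))  # close to 0.0, with variance reduced by pooling agents 1 and 2
```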
Machine learning is the dominant approach to artificial intelligence, through which computers learn from data and experience. In the framework of supervised learning, for a computer to learn from data accurately and efficiently, some auxiliary information about the data distribution and target function should be provided to it through the learning model. This notion of auxiliary information relates to the concept of regularization in statistical learning theory. A common feature among real-world datasets is that data domains are multiscale and target functions are well-behaved and smooth. In this paper, we propose a learning model that exploits this multiscale data structure and discuss its statistical and computational benefits. The hierarchical learning model is inspired by the logical and progressive easy-to-hard learning mechanism of human beings and has interpretable levels. The model apportions computational resources according to the complexity of data instances and target functions. This property can have multiple benefits, including higher inference speed and computational savings in training a model for many users or when training is interrupted. We provide a statistical analysis of the learning mechanism using multiscale entropies and show that it can yield significantly stronger guarantees than uniform convergence bounds.
Deep neural networks may easily memorize noisy labels present in real-world data, which degrades their ability to generalize. It is therefore important to track and evaluate the robustness of models against noisy label memorization. We propose a metric, called susceptibility, to gauge such memorization for neural networks. Susceptibility is simple and easy to compute during training. Moreover, it does not require access to ground-truth labels and it only uses unlabeled data. We empirically show the effectiveness of our metric in tracking memorization on various architectures and datasets and provide theoretical insights into the design of the susceptibility metric. Finally, we show through extensive experiments on datasets with synthetic and real-world label noise that one can utilize susceptibility and the overall training accuracy to distinguish models that maintain a low memorization on the training set and generalize well to unseen clean data.
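The abstract does not define the metric, so the following is only an illustrative proxy for tracking noisy-label memorization with unlabeled data: briefly fine-tune a copy of the model on random labels and measure how far its predictions drift. The probing scheme, loader format (batches of `(inputs, ...)` tuples), and hyperparameters are assumptions, not the paper's susceptibility metric.

```python
import copy
import torch
import torch.nn.functional as F

def susceptibility_proxy(model, unlabeled_loader, num_classes, lr=1e-3, steps=50, device="cpu"):
    """Illustrative proxy (NOT the paper's exact definition): fine-tune a copy of `model`
    on randomly drawn labels for unlabeled inputs, then measure how far the copy's
    predictions drift from the original model on the same data. Larger drift suggests
    the network memorizes label noise more easily."""
    model = model.to(device)
    probe = copy.deepcopy(model).train()
    opt = torch.optim.SGD(probe.parameters(), lr=lr)
    data_iter = iter(unlabeled_loader)
    for _ in range(steps):
        try:
            x = next(data_iter)[0].to(device)      # loader assumed to yield (inputs, ...) tuples
        except StopIteration:
            data_iter = iter(unlabeled_loader)
            x = next(data_iter)[0].to(device)
        y_rand = torch.randint(0, num_classes, (x.size(0),), device=device)
        opt.zero_grad()
        F.cross_entropy(probe(x), y_rand).backward()
        opt.step()

    model.eval()
    probe.eval()
    drift, n = 0.0, 0
    with torch.no_grad():
        for batch in unlabeled_loader:
            x = batch[0].to(device)
            p = F.log_softmax(model(x), dim=1)     # original predictions
            q = F.log_softmax(probe(x), dim=1)     # predictions after random-label probing
            drift += F.kl_div(q, p, log_target=True, reduction="sum").item()
            n += x.size(0)
    return drift / max(n, 1)
```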
Time series is the most prevalent form of input data for educational prediction tasks. The vast majority of research using time series data focuses on hand-crafted features, designed by experts for predictive performance and interpretability. However, extracting these features is labor-intensive for humans and computers. In this paper, we propose an approach that utilizes irregular multivariate time series modeling with graph neural networks to achieve comparable or better accuracy with raw time series clickstreams in comparison to hand-crafted features. Furthermore, we extend concept activation vectors for interpretability in raw time series models. We analyze these advances in the education domain, addressing the task of early student performance prediction for downstream targeted interventions and instructional support. Our experimental analysis on 23 MOOCs with millions of combined interactions over six behavioral dimensions shows that models designed with our approach can (i) beat state-of-the-art educational time series baselines with no feature extraction and (ii) provide interpretable insights for personalized interventions. Source code: https://github.com/epfl-ml4ed/ripple/.
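Concept activation vectors themselves are an established technique (TCAV, Kim et al. 2018); a hedged sketch of computing a CAV and a concept-sensitivity score from layer activations is shown below. The activation matrices and how concepts are defined for raw clickstream models are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_activation_vector(acts_concept, acts_random):
    """Fit a linear classifier separating layer activations of concept examples from
    random examples; the unit-normalised weight vector is the CAV. How concepts are
    defined for raw clickstream time series is specific to the paper."""
    X = np.vstack([acts_concept, acts_random])
    y = np.concatenate([np.ones(len(acts_concept)), np.zeros(len(acts_random))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    w = clf.coef_.ravel()
    return w / np.linalg.norm(w)

def concept_sensitivity(grads_wrt_layer, cav):
    """Directional derivative of the model output along the CAV: positive values mean
    the concept pushes the prediction up for that example."""
    return grads_wrt_layer @ cav
```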
Cross-domain graph anomaly detection (CD-GAD) describes the problem of detecting anomalous nodes in an unlabelled target graph using auxiliary, related source graphs with labelled anomalous and normal nodes. Although it presents a promising approach to address the notoriously high false positive issue in anomaly detection, little work has been done in this line of research. There are numerous domain adaptation methods in the literature, but it is difficult to adapt them for GAD due to the unknown distributions of the anomalies and the complex node relations embedded in graph data. To this end, we introduce a novel domain adaptation approach, namely Anomaly-aware Contrastive alignmenT (ACT), for GAD. ACT is designed to jointly optimise: (i) unsupervised contrastive learning of normal representations of nodes in the target graph, and (ii) anomaly-aware one-class alignment that aligns these contrastive node representations and the representations of labelled normal nodes in the source graph, while enforcing significant deviation of the representations of the normal nodes from the labelled anomalous nodes in the source graph. In doing so, ACT effectively transfers anomaly-informed knowledge from the source graph to learn the complex node relations of the normal class for GAD on the target graph without any specification of the anomaly distributions. Extensive experiments on eight CD-GAD settings demonstrate that our approach ACT achieves substantially improved detection performance over 10 state-of-the-art GAD methods. Code is available at https://github.com/QZ-WANG/ACT.
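A hedged sketch of what a joint objective in this spirit could look like, given node representations from the target and source graphs; the specific contrastive form, temperature, and margin are illustrative assumptions rather than the authors' exact losses.

```python
import torch
import torch.nn.functional as F

def act_style_loss(z_tgt, z_tgt_aug, z_src_normal, z_src_anom, tau=0.2, margin=1.0, lam=1.0):
    """Hedged sketch of a joint objective in the spirit of ACT (not the authors' exact losses):
    (i) an InfoNCE-style contrastive term over two augmented views of target-graph nodes, and
    (ii) a one-class alignment term that pulls target representations toward labelled normal
    source representations while keeping them at least `margin` away from labelled anomalies."""
    # (i) contrastive learning of normal representations on the target graph
    z1 = F.normalize(z_tgt, dim=1)
    z2 = F.normalize(z_tgt_aug, dim=1)
    logits = z1 @ z2.t() / tau                                  # [N, N] cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)         # positives on the diagonal
    contrastive = F.cross_entropy(logits, labels)

    # (ii) anomaly-aware one-class alignment against the source graph
    c_normal = F.normalize(z_src_normal, dim=1).mean(dim=0)     # centre of labelled normal nodes
    pull = (z1 - c_normal).pow(2).sum(dim=1).mean()             # draw target nodes to that centre
    d_anom = torch.cdist(z1, F.normalize(z_src_anom, dim=1))    # distances to labelled anomalies
    push = F.relu(margin - d_anom).mean()                       # hinge: keep anomalies far away

    return contrastive + lam * (pull + push)
```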
Time series anomaly detection has applications in a wide range of research fields and industries, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, their advantages and limitations are also discussed. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. Finally, it summarises open issues in research and challenges faced when adopting deep anomaly detection models.
This paper proposes transferred initialization with modified fully connected layers for COVID-19 diagnosis. Convolutional neural networks (CNNs) have achieved remarkable results in image classification. However, due to the complexity of image-recognition applications, training a high-performance model is a very complex and time-consuming process. Transfer learning, on the other hand, is a relatively new learning method that has been employed in many fields to achieve good performance with reduced computation. In this study, PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied for the first time as initialization on the MNIST dataset, with modified fully connected layers. The PyTorch pre-trained models used were previously trained on ImageNet. The proposed model was developed and validated in Kaggle notebooks and reached an outstanding accuracy of 99.77% without spending huge computational time during network training. We also applied the same methodology to the SIIM-FISABIO-RSNA COVID-19 Detection dataset and reached 80.01% accuracy. In contrast, previous methods require extensive computational time during training to reach a high-performance model. The code is available at the following link: github.com/dipuk0506/spinalnet
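A minimal sketch of this kind of setup, assuming a recent torchvision; the head sizes, dropout, and backbone freezing are illustrative choices, not necessarily the authors' exact modified layers.

```python
import torch.nn as nn
from torchvision import models

# Hedged sketch (assumes torchvision >= 0.13): load an ImageNet-pretrained VGG19_bn and
# replace its classifier head with a smaller fully connected stack ending in 10 outputs
# for MNIST.
model = models.vgg19_bn(weights=models.VGG19_BN_Weights.IMAGENET1K_V1)

for p in model.features.parameters():
    p.requires_grad = False                       # optionally freeze the convolutional backbone

num_feats = model.classifier[0].in_features       # 25088 for VGG19_bn
model.classifier = nn.Sequential(
    nn.Linear(num_feats, 512),
    nn.ReLU(inplace=True),
    nn.Dropout(0.5),
    nn.Linear(512, 10),                           # 10 MNIST classes
)

# MNIST images are 1x28x28 while VGG expects 3x224x224, so the data pipeline would need
# transforms.Resize(224) and transforms.Grayscale(num_output_channels=3) before training.
```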
Existing multiparty dialogue datasets for coreference resolution are nascent, and many challenges remain unresolved. We create a large-scale dataset for this task, Multilingual Multiparty Coref (MMC), based on TV transcripts. Because gold-quality subtitles are available in multiple languages, we propose reusing the annotations to create silver coreference data in other languages (Chinese and Farsi) via annotation projection. On the gold (English) data, off-the-shelf models perform relatively poorly on MMC, suggesting that MMC covers multiparty coreference more broadly than prior datasets. On the silver data, we find success both in using it for data augmentation and in training from scratch, which effectively simulates the zero-shot cross-lingual setting.
Networked control systems are feedback control systems whose components are distributed across different locations connected through a communication network. Because communication takes place over the Internet and there are limits on bandwidth and packet size, network constraints arise, among them time delays and packet loss. These network constraints degrade performance and can even destabilize the system. To overcome the adverse effects of these communication constraints, various methods have been developed, a representative one being networked predictive control. This method proposes a controller that actively compensates for network time delays and packet loss. This paper aims to implement a networked predictive control system to control a group of robots over a computer network. Network delays are accounted for by a predictor, while the potential for packet loss is reduced by using redundant control packets. The results will show the stability of the system despite significant delays and packet loss. Moreover, an improvement over the previous networked predictive control system will be proposed and the resulting performance gains will be shown. Finally, the effect of different system and environment parameters on the control loop will be studied.
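A hedged sketch of the delay-compensation idea: the controller forward-simulates a plant model and sends a packet of future inputs, so the actuator can fall back on redundant entries when packets arrive late or are lost. The plant, feedback gain, and packet layout are illustrative assumptions.

```python
import numpy as np

# Illustrative plant model, feedback gain and delay bound (assumptions, not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])    # simple double-integrator-like discrete plant
B = np.array([[0.005], [0.1]])
K = np.array([[2.0, 1.5]])                # stabilising state-feedback gain, assumed given

def control_packet(x_meas, delay_bound=5):
    """Controller side: from the (possibly delayed) state measurement, forward-simulate
    the model and return a packet of predicted inputs u(k), u(k+1), ..., so the actuator
    has a valid input even when later packets are delayed or lost."""
    x = x_meas.copy()
    packet = []
    for _ in range(delay_bound + 1):
        u = -(K @ x)                      # predicted control for this future step
        packet.append(u.item())
        x = A @ x + B @ u                 # one-step model prediction
    return packet

def actuator_step(latest_packet, steps_since_received):
    """Plant side: apply the packet entry matching the elapsed delay; redundant entries
    in the newest received packet cover the steps for which packets were lost."""
    idx = min(steps_since_received, len(latest_packet) - 1)
    return latest_packet[idx]

pkt = control_packet(np.array([1.0, 0.0]))
u_now = actuator_step(pkt, steps_since_received=2)   # e.g. two steps of round-trip delay
```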
Persistent topological properties of an image serve as an additional descriptor, providing insights that conventional neural networks may not uncover. Existing research in this area has mainly focused on efficiently integrating the topological properties of data into the learning process to enhance performance. However, no existing study demonstrates all the possible scenarios in which introducing topological properties can improve or harm performance. This paper provides a detailed analysis of the effectiveness of topological properties for image classification across various training scenarios, defined by: the number of training samples, the complexity of the training data, and the complexity of the backbone network. We identify the scenarios that benefit the most from topological features, e.g., training simple networks on small datasets. Additionally, we discuss the problem of topological consistency of a dataset, which is one of the major bottlenecks in using topological features for classification. We further demonstrate how topological inconsistency can harm performance in certain scenarios.
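A hedged sketch of extracting persistence-based features from a grayscale image with the gudhi library (the choice of library and the fixed-length lifetime summary are assumptions; the paper's exact pipeline is not specified in the abstract).

```python
import numpy as np
import gudhi  # assumed TDA library (pip install gudhi); not necessarily the paper's toolchain

def persistence_summary(img: np.ndarray, max_pairs=16):
    """Compute 0- and 1-dimensional persistence of a grayscale image via a cubical-complex
    filtration on pixel intensities, and return a fixed-length vector of the longest
    lifetimes (death - birth), which could be concatenated with CNN features."""
    cc = gudhi.CubicalComplex(top_dimensional_cells=img.astype(np.float64))
    cc.compute_persistence()
    feats = []
    for dim in (0, 1):
        pairs = cc.persistence_intervals_in_dimension(dim)
        if len(pairs):
            lifetimes = np.nan_to_num(pairs[:, 1] - pairs[:, 0], posinf=1.0)  # cap the infinite bar
            lifetimes = np.sort(lifetimes)[::-1]
        else:
            lifetimes = np.array([])
        padded = np.zeros(max_pairs)
        padded[:min(max_pairs, len(lifetimes))] = lifetimes[:max_pairs]
        feats.append(padded)
    return np.concatenate(feats)          # shape (2 * max_pairs,)

vec = persistence_summary(np.random.rand(28, 28))   # topological descriptor for one image
```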